Robust Knowledge Transfer in Tiered Reinforcement Learning
In this paper, we study the Tiered Reinforcement Learning setting, a parallel transfer learning framework in which the goal is to transfer knowledge from the low-tier (source) task to the high-tier (target) task, reducing the exploration risk of the latter while solving the two tasks in parallel. Unlike previous work, we do not assume that the low-tier and high-tier tasks share the same dynamics or reward functions, and we focus on robust knowledge transfer without prior knowledge of the task similarity. We identify a natural and necessary condition for our objective, which we call ``Optimal Value Dominance''. Under this condition, we propose novel online learning algorithms that, for the high-tier task, achieve constant regret on a subset of states depending on the task similarity and retain near-optimal regret when the two tasks are dissimilar, while for the low-tier task they remain near-optimal without any sacrifice. Moreover, we further study the setting with multiple low-tier tasks and propose a novel transfer-source selection mechanism, which can aggregate the information from all low-tier tasks and allow provable benefits on a much larger state-action space.
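As a toy illustration of the flavor of transfer involved (a hypothetical simplification, not the paper's algorithm), one can clip an optimistically initialized high-tier value estimate by the low-tier optimal values plus a similarity slack; such clipping is safe precisely when an optimal-value-dominance-style condition holds. The MDP, rewards, and the `slack` constant below are all invented for illustration:

```python
import numpy as np

# Toy 2-state, 2-action MDP with shared dynamics; the two tiers get
# slightly different rewards (a stand-in for "similar but not identical").
P = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.8, 0.2], [0.2, 0.8]]])   # P[s, a, s']
R_low  = np.array([[1.0, 0.0], [0.0, 1.0]])
R_high = np.array([[0.9, 0.1], [0.0, 1.0]])
gamma = 0.9

def value_iteration(R):
    """Solve the toy MDP exactly; proxy for the low-tier learner's estimate."""
    Q = np.zeros((2, 2))
    for _ in range(500):
        V = Q.max(axis=1)
        Q = R + gamma * P @ V          # batched over (s, a)
    return Q

Q_low = value_iteration(R_low)                    # low-tier solved in parallel
Q_high_opt = np.full((2, 2), 1.0 / (1 - gamma))   # optimistic high-tier init

# Hypothetical transfer rule: tighten the high-tier optimistic estimate using
# the low-tier values plus a slack reflecting assumed task similarity.
slack = 0.2 / (1 - gamma)
Q_transfer = np.minimum(Q_high_opt, Q_low + slack)
```

Tighter optimistic upper bounds shrink the amount of exploration the high-tier learner must do on states where the two tasks agree, which is the intuition behind constant regret on those states.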
Disentangling and mitigating the impact of task similarity for continual learning
Continual learning of partially similar tasks poses a challenge for artificial neural networks, as task similarity presents both an opportunity for knowledge transfer and a risk of interference and catastrophic forgetting. However, it remains unclear how task similarity in input features and readout patterns influences knowledge transfer and forgetting, and how these factors interact with common algorithms for continual learning. Here, we develop a linear teacher-student model with latent structure and show analytically that high input-feature similarity coupled with low readout similarity is catastrophic for both knowledge transfer and retention. Conversely, the opposite scenario is relatively benign. Our analysis further reveals that task-dependent activity gating improves knowledge retention at the expense of transfer, while task-dependent plasticity gating affects neither retention nor transfer performance in the over-parameterized limit. In contrast, weight regularization based on the Fisher information metric significantly improves retention, regardless of task similarity, without compromising transfer performance. Its diagonal approximation, however, as well as regularization in Euclidean space, is much less robust to task similarity. We demonstrate consistent results in a permuted-MNIST task with latent variables. Overall, this work provides insight into when continual learning is difficult and how to mitigate its difficulties.
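The catastrophic regime described above can be reproduced in a minimal numerical sketch (an assumed setup, not the paper's exact teacher-student model): two linear regression tasks that share identical input features but have opposite readouts, trained sequentially with plain gradient descent, show severe forgetting of the first task:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 20, 1000

def make_task(feature_map, readout):
    """Inputs share the feature map; the teacher readout differs per task."""
    X = rng.standard_normal((n, d)) @ feature_map
    return X, X @ readout

# High input-feature similarity (identical map F) coupled with low readout
# similarity (w2 = -w1): the regime flagged as catastrophic for retention.
F = rng.standard_normal((d, d))
w1 = rng.standard_normal(d)
w2 = -w1
X1, y1 = make_task(F, w1)
X2, y2 = make_task(F, w2)

def gd(w, X, y, lr=1e-3, steps=2000):
    """Full-batch gradient descent on mean squared error."""
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w

w = gd(np.zeros(d), X1, y1)
loss1_before = np.mean((X1 @ w - y1) ** 2)
w = gd(w, X2, y2)                              # sequential training on task 2
loss1_after = np.mean((X1 @ w - y1) ** 2)      # task-1 loss rises: forgetting
```

Because the tasks share the same input covariance, training on the second task pulls the student directly toward the opposite readout, overwriting the first task's solution rather than coexisting with it.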
approach to study GNN designs, the first quantitative analysis for GNN task similarity, and offers rigorous findings via 2
We thank the reviewers for their constructive feedback. We thank R2 and R3 for pointing out that our paper lacks theoretical analysis. LU activation significantly improves GNN performance. We will add these new discussions to the revised paper. We thank the reviewers for suggesting other design dimensions to explore.
Supplementary: CAM-GAN: Continual Adaptation Modules for Generative Adversarial Networks
Our approach leverages the feature space for style modulation to adapt to the novel task. We train our model on various datasets to show the effectiveness of our approach in generating high-dimensional images from diverse domains in a streamed manner. Due to limited space, we could only demonstrate part of the generated images in the main paper (Sec. 2.3). We inherit the GAN architecture from "Which Training Methods for GANs do actually Converge?" (GP-GAN). We select the GP-GAN architecture because it has been very successful at generating high-quality samples in high-dimensional spaces by providing stable training.
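The general idea of per-task feature-space modulation can be sketched minimally (illustrative only; the actual CAM-GAN adaptation modules are more elaborate): lightweight channel-wise scale-and-shift parameters adapt a frozen generator's features to each new task, so only the small adapter is trained per task. All names and shapes below are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Features from a frozen base generator block (stand-in random activations).
C, H, W = 8, 4, 4
base_feat = rng.standard_normal((C, H, W))

class StyleAdapter:
    """Hypothetical per-task adaptation module: channel-wise scale and shift
    applied to frozen intermediate features (FiLM-style modulation)."""
    def __init__(self, channels):
        self.gamma = np.ones(channels)   # task-specific scale (trainable)
        self.beta = np.zeros(channels)   # task-specific shift (trainable)

    def __call__(self, feat):
        return self.gamma[:, None, None] * feat + self.beta[:, None, None]

# One small adapter per task in the stream; the base generator never changes,
# so earlier tasks cannot be forgotten.
adapters = {"task_A": StyleAdapter(C), "task_B": StyleAdapter(C)}
adapters["task_B"].gamma[:] = 2.0        # pretend task_B has been adapted

out_A = adapters["task_A"](base_feat)    # identity at initialization
out_B = adapters["task_B"](base_feat)    # modulated for the new task
```

Keeping the base weights frozen and routing each task through its own adapter sidesteps catastrophic forgetting by construction, at the cost of a small per-task parameter overhead.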